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Abstract — We present methods to synthesize cooperative 
strategies for multi-vehicle control problems using mixed integer 
linear programming. Complex multi-vehicle control problems are 
expressed as mixed logical dynamical systems. Optimal strategies 
for these systems are then solved for using mixed integer linear 
programming. We motivate the methods on problems derived 
from an adversarial game between two teams of robots called 
RoboFlag. We assume the strategy for one team is fixed and 
governed by state machines. The strategy for the other team is 
generated using our methods. Finally, we perform an average 
case computational complexity study on our approach. 

Index Terms — Cooperative robotics, multi- vehicle systems, 
mixed integer linear programming, robot motion planning, path 
and trajectory planning, hybrid systems, mathematical optimiza- 
tion. 



I. Introduction 

For many problems, a team of vehicles can accomplish 
an objective more efficiently and more effectively than a 
single vehicle can. Some examples include target intercept [2], 
search [4], terrain mapping [24], object manipulation [38], 
surveillance, and space-based interferometry. For these prob- 
lems, it is desirable to design a multi-vehicle cooperative 
control strategy. 

There is a large literature on cooperative control. Work 
from team theory [25], [21] considers the case where team 
members have different information and the objective func- 
tion is quadratic. Cooperative estimation for reconnaissance 
problems is considered in [30]. In [32], [6], [16] mixed integer 
linear programming is used for multi-vehicle target assignment 
and intercept problems. Hierarchical methods are used for 
cooperative rendezvous in [27] and for target assignment 
and intercept in [2]. A review from the machine learning 
perspective is presented in [35]. There are several recent 
compilations of cooperative control articles in [1], [5], [28]. 

In this paper, we propose a hybrid systems approach for 
modeling and cooperative control of multi-vehicle systems. 
We use the class of hybrid systems called mixed logical 
dynamical systems [7], which are governed by difference 
equations and logical rules and are subject to linear inequality 
constraints. The main motivation for using mixed logical 
dynamical systems is their ability to model a wide variety 
of multi-vehicle problems and the ease of modifying problem 
formulations. In our approach, a problem is modeled as a 
mixed logical dynamical system, which we represent as a 

M. G. Earl (corresponding author), Theoretical and Applied Me- 
chanics Department, Cornell University, Ithaca, NY 14853, (e-mail: 
mgel@cornell . edu). 

R. D'Andrea, Mechanical and Aerospace Engineering Department, 
101 Rhodes Hall, Cornell University, Ithaca, NY 14853, (e-mail: 
rd28@cornell . edu). 



mixed integer linear program (MILP). Then, to generate a 
cooperative control strategy for the system, the MILP is solved 
using AMPL [19] and CPLEX [22]. 

Posing a multi-vehicle control problem in a MILP frame- 
work involves modeling the vehicle dynamics and constraints, 
modeling the environment, and expressing the objective. To 
demonstrate the modeling procedure and our approach, we 
consider control problems involving Cornell's multi-vehicle 
system called RoboFlag. For an introduction to RoboFlag, 
see the papers from the invited session on RoboFlag in the 
Proceedings of the 2003 American Control Conference [10], 
[12], [13]. 

Our focus is to find optimal solutions using a flexible 
methodology, which is why we use MILP. However, because 
MILP is in the NP-hard computation class [20], the methods 
may not be fast enough for real-time control of systems 
with large problem formulations. In this case, the methods 
can be used to explore optimal cooperative behavior and as 
benchmarks to evaluate the performance of heuristic or ap- 
proximate methods. In Section lVTl we discuss several methods 
for reducing the computational requirements of our MILP 
approach so that it can be more readily used in real-time 
applications. 

Our approach for multi-vehicle control, first presented 
in [15], [16], was developed independently from a similar 
approach developed by Richards et. al. [33]. Next, we list 
some of the noteworthy aspects of our approach. First, the 
environment which we demonstrate our methods involves 
an adversarial component. We model the intelligence of the 
adversaries with state machines. Second, our approach al- 
lows multiple, possibly nonuniform, time discretizations. Dis- 
cretizing continuous variables in time is necessary for MILP 
formulations. Using many time steps results in large MILPs 
that require a considerable amount of computation time to 
solve. Support for nonuniform discretizations in time allows 
the use of intelligent time step selection algorithms for the 
generation of more efficient MILP problem formulations [17]. 
Finally, because we include the vehicle dynamics in the 
problem formulation, the resulting trajectories are feasible, 
which is advantageous because they can be applied directly 
to the multi-vehicle system. In order to express the vehicle 
dynamics efficiently in our MILP formulation, we restrict the 
control input to each vehicle in a way that allows near-optimal 
performance, as presented in [29]. 

The paper is organized as follows: First, we consider vehicle 
problems that have relatively simple formulations. Then we 
add features in each section until we arrive at the RoboFlag 
multi-vehicle problems. In Section HI] we introduce the dy- 
namics of the vehicles used to motivate our approach, and 
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we formulate and solve a single vehicle minimum control 
effort trajectory generation problem. We build upon this in 
Section Ull] adding obstacles that must be avoided. In Sec- 
tion]^ we show how to generate optimal team strategies for 
RoboFlag problems. In Section|V] we perform an average case 
computational complexity study on our approach. Finally, in 
Section IVII we discuss our methods and ways in which they 
can be be applied. All files for generating the plots found in 
this paper are available online [18]. 

II. Vehicle dynamics 

Multi-vehicle control problems involving the wheeled robots 
of Cornell's RoboCup Team [14], [37] are used to motivate 
the methods presented in this paper In this section, we show 
how to simplify their nonlinear governing equations using a 
procedure from [29]. The result is a linear set of governing 
equations coupled by a nonlinear constraint on the control 
input, which admits feasible vehicle trajectories and allows 
near-optimal performance. We then show how to represent the 
simplified system in a MILP problem formulation. 

Each vehicle has a three-motor omni-directional drive, 
which allows it to move along any direction irrespective of its 
orientation. The nondimensional governing equations of each 
vehicle are given by 
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and lJ{t) = {Ux{t),Uy{t),Uz{t)) G U. In these equations, 
{x{t),y{t)) are the coordinates of the vehicle on the playing 
field, 9{t) is the orientation of the vehicle, u{9{t),t) is the 
6'(i)-dependent control input, m is the mass of the vehicle, / 
is the vehicle's moment of inertia, L is the distance from the 
drive to the center of mass, and Ui{t) is the voltage applied 
to motor i. The set of admissible voltages U is given by the 
unit cube, and the set of admissible control inputs is given by 
P{9)U. 

These governing equations are coupled and nonlinear To 
simplify them, we replace the set P{9)U with the maximal 9- 
independent set found by taking the intersection of all possible 
sets of admissible controls. The result is a ^-independent 
control input, denoted {ux{t),Uy{t),Uz{t)), that is subject to 
the inequahty constraints Ux{tY + Uy{t)^ < (3 — |ue(t)|)^/4 
and < 3. 

Using the restricted set as the set of allowable control inputs, 
the governing equations decouple and are given by 
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The constraints on the control input couple the degrees of 
freedom. 



To decouple the 9 dynamics, we set < 1. Then the 

constraint on the control input becomes 



Ux{tf+Uy{tf < 1. 
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Now the equations of motion for the translational dynamics 
of the vehicle are given by 



X{t)+Xit) = Ux{t), 

yit)+yit)^Uy{t), 
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subject to equation (0}. In state space form, equation (|5} is 
x(t) = Acx(t) + Bcu(i), where x — {x,y,i,y) is the state 
and u ~ {ux, Uy) is the control input. 

By restricting the admissible control inputs we have simpli- 
fied the governing equations in a way that allows near optimal 
performance as shown in [29]. This procedure allows real-time 
calculation of many near-optimal trajectories and has been 
successfully used by Cornell's RoboCup team [14], [37], [29]. 

To represent the governing equations in a MILP framework, 
we discretize the control input in time and require it be 
constant between time steps. The result is a set of linear 
discrete time governing equations. 

Let Nu be the number of discretization steps for the control 
input u(i), let i„[fc] be the time at step k, and let Tu[k] > be 
the time between steps k and k + 1, for k e {0, . . . , Nu — 1}. 
The discrete time governing equations are given by 
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x„[fc] = {xu[k],yu[k],Xu[k],yu[k]), and u[k] = 
{ux[k],Uy[k]). The coefficients A[k] and B[fc] are functions 
of k because we have allowed for nonuniform time 
discretizations. Because there will be several different time 
discretizations used in this paper, we use subscripts to 
differentiate them. In this section, we use the subscript u 
to denote variables associated with the discretization in the 
control input u{t). 

The discrete time governing equations can be solved explic- 
itly to obtain 
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Xu[k] = e-*"['^'li„[0] 

1=0 



and similarly for and 

In later sections of this paper it will be necessary to 
represent the position of the vehicle at times between control 
discretization steps, in terms of the control input. Because the 
set of governing equations is linear, given the discrete state 
Xijfc] and the control input u[fc], we can calculate the vehicle's 
state at any time t using the following equations: 
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where k satisfies tu[k] < t < t„[fc + 1]. If the time 
discretization of the control input is uniform, Tu[fc„] — Tu 
for all then fc„ = [t/T„J. The other components of the 
vehicle's state, y{t) and y{t), can be calculated in a similar 
way. 

The control input constraint given by equation @ cannot 
be expressed in a MILP framework because it is nonlinear. 
To incorporate this constraint we approximate it with a set of 
linear inequalities that define a polygon. The polygon inscribes 
the region defined by the nonlinear constraint. We take the 
conservative inscribing polygon to guarantee that the set of 
allowable controls, defined by the region, is feasible. We define 
the polygon by the set of Af„ linear inequality constraints 
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for each step k E {1, . . . , iV„}. 

To illustrate the approach, we consider the following min- 
imum control effort trajectory generation problem. Given a 
vehicle governed by equations (|6} and (|8}, find the sequence 
of control inputs {u[A;]}^g^^ that transfers the vehicle from 
starting state x(0) = to finishing state x(</) = x/ and 
minimizes the cost function 
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To convert the absolute values in the cost function to linear 
form, we introduce auxiliary continuous variables Zx[k] and 
Zy [k] and the inequality constraints 
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Minimizing Zx[k] subject to the inequalities Ux[k] < Zx[k] and 
Ux[k] > —Zx[k] is equivalent to minimizing |u2;[fc]| (similarly 




Fig. 1. Plots of the minimum control effort example. Figure (a) shows 
the vehicle trajectory in the {x, y) plane. The circle and dotted line denote 
the initial position and velocity, respectively. The square denotes the final 
position. Figures (b)-(d) show the time histories of the control inputs, the 
positions, and the velocities, respectively. The solid lines in Figures (b)-(d) 
represent x components and the dotted lines represent y components. The 
values for the parameters are A'^^ = 10, Mu = 20, Ttj[/c] = 0.3 for all 
k, ixo,yo,xo,yo) = (-0.25,-0.2,-0.5,-0.2), and (x j ,y f ,x f ,y f) = 
(0.4,0.3,0.0,0.0). 



for |uy[fc]|) [8]. Using the auxiliary variables, the cost function 
can be written as a linear function. 
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The resulting optimization problem (minimize il l\ subject 
to (|6jl, (jsjl, ( I lot , and the boundary conditions) is in MILP 
form. Because binary variables do not appear in the problem 
formulation, it is a linear program and is easily solved to obtain 
the optimal sequence of control inputs. The solution for an 
example instance is shown in Figure ^ 

III. Obstacle avoidance 

In vehicle control, it is necessary to avoid other vehicles, 
stationary and moving obstacles, and restricted regions. In this 
section, we show how to formulate and solve these avoidance 
problems using MILP. We start by showing a MILP method 
to guarantee circular obstacle avoidance at N^, discrete times. 
The version of this method developed in [34], and a similar 
version developed independently in in [15], [16], uniformly 
distributes obstacle avoidance times. Here we present a version 
of the method that allows nonuniform distributions of obstacle 
avoidance times. 

The subscript o is used to denote variables associated with 
the time discretization for obstacle avoidance. For step k, taken 
to be an element of the set {1, . . . , No}, let to[k] be the time 
at which obstacle avoidance is enforced. Let Robst denote 
the radius of the obstacle. Let {xobst[k],yobst[k]) denote the 
coordinates of its center at time to[k]. We approximate the 
obstacle with a polygon, denoted 0[k], defined by a set of 
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Mo inequalities. The polygon is given by 
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To guarantee obstacle avoidance at time to[k], the coordi- 
nates of the vehicle must be outside the region 0[k]. This 
avoidance condition can be written as (xo[fc], 2/o[fc]) ^ 0[k], 
where {xo [k] , yo [k] ) are the coordinates of the vehicle at 
time to[k]. Here Xo[k] = x{to[k]) and yo[k] ~ y{to[k]) are 
expressed in terms of the control inputs using equation 0. 

Because at least one constraint defining the region 0[k] 
must be violated in order to avoid the obstacle, the avoidance 
condition is equivalent to the following condition: 

there exists an m such that 

{xo[k] - Xobst[k]) sin 



Mo 
27rm 
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To express this avoidance constraint in a MILP problem 
formulation, it must be converted to an equivalent set of linear 
inequality constraints. We do so by introducing auxiliary bi- 
nary variable b„i[k] E {0, 1} and the following Mo inequality 
constraints. 
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where iJ is a large positive number taken to be larger than the 
maximum dimension of the vehicle's operating environment 
plus the radius of the obstacle. If bm [k] — 1, the right hand side 
of the inequality is a large, negative number that is always less 
than the left hand side. In this case, the inequality is inactive 
because it is trivially satisfied. If bmlk] = 0, the inequality is 
said to be active because it reduces to an inequality from the 
existence condition above. For obstacle avoidance, at least one 
of the constraints in equation (I14> must be active. To enforce 
this, we introduce the following inequality constraint into the 
problem formulation. 
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Therefore, to enforce obstacle avoidance at time to[k], 
set of binary variables {fcm[fc]}m=i '^^e constraints given 
by equations (I14> and il5\ are added to the MILP problem 
formulation. 

Consider the example problem from Section |ll] with obsta- 
cles. In this problem, we want to transfer the vehicle from 
start state to finish state in time tf using minimal 
control effort while avoiding obstacles. To enforce obstacle 
avoidance at each time in the set {io[fc]}£i, we augment the 





Fig. 2. Plots of the minimum control effort obstacle avoidance example. The 
shaded region is the obstacle to be avoided and the X 's along the trajectory 
denote the avoidance points {xo[k],yo[k]). Figure (a) is the original solution. 
Figure (b) is the solution after two steps of the iterative obstacle avoidance 
algorithm. The O's are the avoidance points added to the MILP formulation 
by the iterative algorithm. The values for the parameters are Nu = 10, 
Mu = 20, Tu[k] = 0.3 for all k. Mo = 10, No = 10, to[k] = kT, 
(xs,ys,Xs,ys) = (—0.25, —0.2, —0.5, —0.2), and {x j ,yf ,x j ,yf) = 
(0.4,0.3,0.0,0.0). 



MILP formulation in Section|n]with the set of binary variables 
{^m[fc]}m=ii constraints ( I14> . and constraint dlSI l for all k in 
the set {1, . . . , No}. 

Distributing the avoidance times uniformly (uniform grid- 
ding), as in [34], [16], results in a trajectory that avoids 
obstacles at each discrete time in the set, but the trajectory 
can collide with obstacles between avoidance times. This 
is shown for an example instance in Figure |2ja). In this 
example, the trajectory intersects the obstacle between the 
sixth and seventh avoidance time steps. A simple method 
to eliminate this behavior is to represent the obstacle with 
a polygon that is larger than the obstacle. Then distribute 
obstacle avoidance times uniformly such that the sampling 
time is small enough to guarantee avoidance. In general, this 
is not a desirable approach because it results in large MILPs 
that require significant computational effort to solve. 

A better approach is to select the avoidance times intelli- 
gently. In [17], we have developed an iterative MILP algorithm 
that does this. We summarize this algorithm here. First, pick 
an initial set of times {io[fc]}fc^^^i at which obstacle avoidance 
will be enforced. Then, formulate and solve the MILP as 
described above representing the obstacles with polygons 
slightly larger than the obstacles. Next, check the resulting 
trajectory for collisions with any of the obstacles (not the 
polygons which represent them in the MILP). If there are 
no collisions, terminate the algorithm. If there is a collision, 
compute the time intervals for which collisions occur denoting 
the time interval for collision i by [t\' ,v^'). For each interval 
i, pick a time within the interval, such as {t\^^ + t^*-')/2. At 
each of these times add obstacle avoidance constraints to the 
MILP formulation. Then, solve the augmented MILP repeating 
the procedure above (first checking if the resulting trajectory 
intersects any obstacles, etc.) until a collision free trajectory 
is found. Figure |2jb) shows the effectiveness of this algorithm 
after two iterations. 
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Fig. 3. The RoboFlag Drill used to motivate the methods presented is this 
paper. The drill takes place on a playing field with a Defense Zone at its 
center. The objective is to design a cooperative control strategy for the team 
of defending vehicles (black) that minimizes the number of attacking vehicles 
(white) that enter the Defense Zone. 




(in Defenze Zone) 
or (intercepted) 



f inactive^ 
la[k]=0 



otherwise 



any 



Fig. 4. The two state (attack and inactive) attacker state machine. The attacker 
starts in the attack state. It transitions to the inactive state, and remains in this 
state, if it enters the Defense Zone or if it is intercepted by the defender. 



Vfce{i,...,7VJ. 



(16) 



The attacker has two discrete modes: attack and inactive. 
When in attack mode, it moves toward the Defense Zone 
at constant velocity along a straight line path. The attacker, 
initially in attack mode at the beginning of play, transitions to 
inactive mode if the defender intercepts it or if it enters the 
Defense Zone. Once inactive, the attacker does not move and 
remains inactive for the remainder of play. These dynamics 
are captured by the following discrete time equations and state 
machine 



IV. RoboFlag Problems 

To motivate our multi-vehicle methods, we apply them to 
simplified versions of the RoboFlag game [12], [13], [10], 
which we call RoboFlag Drills because they serve as practice 
for the real game. The drills involve two teams of robots, 
the defenders and the attackers, on a playing field with a 
circular region of radius Rdz at its center called the Defense 
Zone, as shown in Figure |3] The attackers' objective is to 
fill the Defense Zone with as many attackers as possible. 
The defenders' objective is to deny as many attackers from 
entering the Defense Zone as possible without entering the 
zone themselves. We consider Defensive Drill problems in 
which each attacker has a fixed intelligence governed by a 
state machine. The goal is to design a team control strategy for 
the defenders that maximizes the number of attackers denied 
from the Defense Zone. In this section, we use MILP methods 
to generate such strategies. We consider two versions of the 
Defensive Drill each with a different set of laws governing 
attacker intelligence. 

To start, we consider one-on-one Defensive Drill problems. 
This is the simplest case and involves one defender and one 
attacker Although this case is not particularly interesting, we 
start with it for notational clarity. Next, we generalize to the 
case involving Nu defenders and Na attackers, which is a 
straightforward extension. 

A. Defensive Drill 1: one-on-one case 

The defender is governed by the discrete time dynamical 
system from Section HH 

x„[fc + 1] = A[fc]x„[fc] + B[fc]u[A:] 
x„ [0] = Xs 
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p[k + 1] = p[k] + VpTa[k]a[k] 
q[k + 1] = q[k] + VgTa[k]a[k] 



(17) 



a[k + 1] 



if (a[k] = 1) 

and (not in Defense Zone) 
and (not intercepted) 
if (a[k] = 0) 
or (in Defense Zone) 
or (intercepted) 



(18) 



Vfc e {!,..., TVa} 



with initial conditions 



p[0] = Ps, q[0] = Qs, and a[Q] = 1, 



(19) 



where Na is the number of samples, k £ {!,..., TVq}, 
Ta[k] > is the time between samples k and fc + 1, {p[k], q[k]) 
is the attacker's position at time ta[k] — J2i=o ^aW' i^p^'^q) 
is its constant velocity vector, and a[k] e {0, 1} is a discrete 
state indicating the attacker's mode. The attacker is in attack 
mode when a[k] = 1 and inactive mode when a[k] = 0. The 
attacker state machine is shown in Figure |3 Here we use the 
subscript a to denote the time discretization for the attacker's 
dynamics. 

Defense Zone 

Because the defender wants to keep the attacker from enter- 
ing the Defense Zone, a binary variable indicating whether or 
not the attacker is inside the Defense Zone is introduced. This 
variable is used to define the attacker state machine precisely. 
We denote the binary variable with 7[fc] e {0, 1}. When the 
attacker is in the Defense Zone at step k, 7[fc] — 1. When the 
attacker is outside the Defense Zone at step fc, jlk] — 0. 

Similar to the approach used to define obstacles, the Defense 
Zone is approximated using a polygon Q defined by a set of 
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Mdz inequalities 
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The association between 7[fc] and is 



(7[fc] = 1) 



(20) 



(21) 



If the defender keeps the binary variable jlk] equal to for 
all k e {1, • • ■ ,Na}, it has successfully denied the attacker 
from the region Q and thus from the Defense Zone. However, 
in order to use the binary variable jlk] in the problem 
formulation, we must enforce the logical constraint given by 
equation ( I2U . To enforce this constraint in MILP, we convert 
it into an equivalent set of inequality constraints. 

We introduce the binary variable gm[k] G {0, 1} to indicate 
whether or not the mth constraint of Q is satisfied by the 
attacker with position g[fc]). This association is made 

by introducing the logical constraint 



{gm[k] = 1) 

■ 27rm 
p[k\ sm - 



27rm 
cos ^1^— < Rdi 



(22) 



Mdz Mdz 

As shown in Appendix HJ it is equivalent to the following set 
of inequalities 
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where e is a small positive number and is a large positive 
number such that the left hand sides of the inequalities are 
bounded from above by H and from below hy —H. 

Using binary variable gm[k], we can write equation Mil as 



(7[fc] = 
i.9m[k] 



1 Vme {!,..., Md J), 
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which is equivalent to the inequality constraints 



5„[fc]-7[fc]>0 Vme{l, 
(l-g,[fc])+7[fc] > 1, 



, Mdz} 



i=l 



(25) 



as shown in Appendix HI 

The logical constraint given by equation J21> is equivalent 
to the inequality constraints given by equations ( I23> and M5\ . 
Therefore, we can include the indicator variable ^[k] in the 
MILP formulation by also including the binary variables 
{5m[fc]}m=i constraints j23t and ( 125 > . 

Intercept Region 

To define what it means for a defender to intercept an 
attacker, we introduce an intercept region I[k] rigidly attached 
to the defending robot. The intercept region is a polygon 



defined by a set of Mj inequalities. If an attacker is in this 
polygon, it is considered intercepted. For the defender with 
coordinates (a;a[/c], the intercept region is given by 



{x — Xa [k] ) sin 
Vm e {1, 
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M, 
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where Xa[k] = x{ta[k]) and ya[k] = y{ta\k]) are calculated 
using equation 0, and i?/ is the inscribed radius of the 
polygon. 

The binary variable 5[k] G {0, 1} is introduced to indicate 
whether or not the attacker is inside the defender's intercept 
region. The association is made with the following logical 
constraint 



5[k] 



1 



(p[fc],g[fc])ex[fc] 



(27) 



Using techniques similar to those used for 7[fc], we convert 
this constraint into an equivalent set of inequality constraints 
as follows: For each of the inequalities defining the intercept 
region, we associate a binary variable dm[k] G {0,1} by 
introducing the logical constraint 



I) 



Ca [k])i 



27rm 
Mi 



, .,, 27rm 
{(l[k\ - ya[k\) cos -j^ < Ri 



(28) 



where {p[k],q[k]) are the coordinates of the attacking robot. 
If dm[k] ~ 1, the coordinates of the attacking robot satisfy 
the TOth inequality defining the intercept region. If dm [k] — 0, 
the TOth inequality is not satisfied. Similar to equation ( I22t . 
we can express this logic constraint as the set of inequalities 



{p[k] - Xa[k]) sin 
<Ri+H{l- 



27rTO 



Mi 

dm [k] ) 

{p[k] - Xa[k]) sin + {q[k] 



m~ya[k])' 



> Ri 



Mi 

{H + e)dm[k]. 



ya[k])' 



2'Km 
Mi 

27rTO 
Mi 



(29) 



Using the binary variables dm[k\ we can write equation ( l27l 
as 



m 



1 



{dm[k] = l VtoG {!,..., Af/}). (30) 



Considering both directions of this equivalence, as done for 
7[fc] above, we find that the statement is equivalent to the 
following set of inequality constraints 



dm[k]~5[k]>Q Vtog{1, 
{I - d,[k]) + 5[k] > 1. 



,Mi} 



All 

E 



(31) 



We can use S[k] in our problem formulation if we include the 
binary variables {dm[k]}m=i ™d the inequalities constraints 
given by equations ( I29> and ( 13 1> . 
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Y[a]=l or 8[k]=l 



a[k]=l 




Y[a]=0 and 5[k]=0 



Fig. 5. The two state attacker state machine, as in Figure |4] using binary 
variables S[k] and 7[fc]. 



State machine and objective function 

With the indicator variables j[k] and d[k] defined, we revisit 
the attacker state machine and define it more precisely with 
the following state equation 



1 







a[k+l] 



(32) 



if a[A;] = 1 and j[k] 
and S[k]^0 
if a[k] = or j[k] = 
or d[k] = 1, 

which is shown in Figure|5l The state equations can be written 
as the logical expression 



{a[k- 
{a[k] 



1] - 1) ^ 
= 1 and 7[fc] 



and S[k] ^ 0), (33) 



which is equivalent to the set of inequality constraints 

a[k + 1] + ^[fc] < 1 
a[k + 1] - a[k] < 
a[fc+ 1] +7[fc] < 1 
a[k] - S[k] - "f[k] - a[k + 1] < 0, 



(34) 



as shown in Appendix U 

We have defined the dynamics of the defenders and the 
attackers, and we have converted these dynamics to MILP 
form. The final step in formulating Defensive Drill 1 is to 
introduce an objective function that captures the defender's 
desire to deny the attacker from entering the Defense Zone. 
This objective is achieved by minimizing the binary variable 
7[fc] over the duration of the drill, which is captured by 
minimizing the function 



j = J2i[k]. 



k=l 

We set the duration of the drill such that 



ta[Na] > 



IPs 



(35) 



(36) 



This allows the attacker enough time to reach the Defense 
Zone if it is not intercepted. 

In addition to intercepting the attacker, we would like to 
conserve fuel as a secondary objective. One way to achieve this 
is to add a small penalty on the control effort to the objective 
function as follows: 



J 



E 

fc=i 



k=Q 



\Uy[k]\), 



(37) 



where e is taken to be a small positive number because 
we want the minimization of the binary variable jlk] to 




Fig. 6. The defender denies the attacker from entering the Defense Zone. 
The *'s along the attacker's trajectory denote the attacker's position for each 
time t a [fc] . The polygons along the defender's trajectory denote the intercept 
region I[k] for each time ta[fc]. The x's denote obstacle avoidance points. 



dominate. We use the procedure outlined in Section |lll to 
write equation (I37> in MILP form. 

Summary and example 

We have formulated Defensive Drill 1 as the following 
optimization problem: minimize (I37> subject to defender 
dynamics il6\ : attacker dynamics illl . ( I19t . and i34\ for 
all /c G {l,...Na}', the constraints for the j[k] indicator 
variable (ED and for all m e {!,..., Md^} and for 
all k e {!,..., iVa}; the constraints for the 6[k] indicator 
variable (|29j, and (ED for all m e {!,..., M/} and for 
all k E {1, . . . , Na}; and the avoidance constraints for the 
Defense Zone J14t and (I15> for all m E {!,..., Mo} and for 
all fee {!,..., iVj. 

The solution to an instance of this problem is shown in 
Figure |6l For this instance, the defender denies the attacker 
from the Defense Zone (shaded polygon) by intercepting 
it. Each asterisk along the attacker's trajectory denotes the 
position of the attacker at sample time ta[k]. The white 
polygons along the defender's trajectory denotes the intercept 
region T[k] at time ta[k]. 

B. Defensive Drill 2: one-on-one case 

The dynamics of the defender are the same as the defender 
dynamics in Section HV-AI The dynamics of the attacker are 
the same as the attacker dynamics in Defensive Drill 1, but 
with an additional state called retreat. If a defender is too close 
to the attacker, the attacker enters the retreat state and reverses 
direction. While retreating, if the defender is far enough away, 
the attacker returns to attack mode. 

These dynamics are captured by the following discrete time 
equations and state machine: 

p[k + 1] = p[k] + VpTa[k]ai[k] 

- VpTa[k]a2[k] 

q[k + l]^q[k]+VqTa[k]ai[k] 

- v^Ta[k]a2[k] (38) 
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defender in 
warning region 



Fig. 7. The thi'ee state (attack, inactive, and retreat) attacker state maciiine. 
Tlie attacker starts in tlie attack state. 



(a) (b) 




0.5 1 5 10 

X time 

Fig. 8. Tile solution to an instance of Defensive Drill 2. Figure (a) shows the 
playing field. Each polygon along the attacker's trajectory is a warning region. 
Figure (b) shows the attacker's distance from the center of the Defense Zone 
versus time. The parameters are Mo = 8, Mu = 4, A/j = 8, Mw = 8, 
Maz = S, Nu = 4, No = i, Na = 10, xq = (-.36, .02, -.12, -.22), 
and (po, qo,Va) = (1.15, -.4, -.11). 



ai[fc + 1] 



a2[fc + l] = 



1 if [(ai[/c] = 1 and a2[A;] = 0) 
or iai[k] — and 02 [fc] — 1)] 
and (not scored) 
and (not intercepted) 
and (not in warning region) 

otherwise 

1 if [(ai[fc] = and a2[/c] = 1) or 
(ai[fc] = 1 and 02 [fc] = 0)] 

and (not scored) 
and (not intercepted) 
and (in warning region) 
otherwise 



(39) 



(40) 



for all /c e {1, . . . , Na} and subject to the constraint 

ai[k] + a2[k] < 1 (41) 
and the initial conditions 



p[0] = Ps, q[0] - Qs, ai[0] = 1, 02 [0] - 0. 



(42) 



The state machine is shown in Figure 

The attacker needs two binary state variables because it 
has three discrete modes of operation: attack, retreat, and 
inactive. These modes are represented by (ai[fc],a2[fc]) = 
(1,0), (ai[fc],a2[fc]) = (0,1), and (ai[fc], a2[fc]) = (0,0), 
respectively. Because the state (ai[/c], a2[A:]) = (1,1) is 
not needed, we impose the inequality constraint given by 
equation i41l . 

To determine if the defender is too close to the attacker, a 
warning region is introduced. The warning region W[k] is a 
polygon defined by a set of inequalities similar to the intercept 
region T[k]. We introduce binary auxiliary variables Wm[fc] 
and uj[k], and we introduce inequality constraints similar to 
equations j29> and j31> . The result is the following association 
for indicator variable uj[k]: If uj[k] — 1, the defender is in the 
attacker's warning region at step k, otherwise, uj[k] = 0. The 
attacker state machine can be written as the set of inequality 



constramts 

ai[k + 1] - ai[k] + 02 [fc] + j[k] + S[k] + uj[k] > 

ai[k + 1] + ai[k] - 02 [fc] + j[k] + 6[k] +Lu[k]>0 

a2[k + 1] + ai[k] - a2[k] + j[k] + S[k] - uj[k] > -1 

a2[k + 1] - ai[k] + a2[k] + j[k] + 6[k] - Lu[k] > -1 

ai[k + 1] + ai[k] + a2[k] < 2 

ai[k + 1] - ai[k] - a2[k] < 

ai[fc + 1] +7[fc] < 1 

ai[k+l] + 6[k] < 1 

ai[k + 1] + uj[k] < 1 

a2[fc+ 1] - ai[k] - a2[k] < 

a2[k + 1] + ai[k] + a2[k] < 2 

a2[fc + 1] +7[fc] < 1 

a2[k + l] + S[k] < 1 

a2[k+l]~Lu[k]<0. (43) 

Similar to Defense Drill 1, Defense Drill 2 can be posed 
as a MILP. The solution of an example instance is shown in 
Figure IS] 



C. Nd-oti-Na case 

To generalize the problem formulation to Nn defenders 
and Na attackers, we need to add more variables. Defender 

1, where i G {1, . . . , iV^j}, has state x„i[/c], control input 
Uj[fc], and slack variables Zxi[k] and Attacker j, where 
i E {1, . . . , Na}, has state (p^ [fc] , [A:] , [fc] ) and constant 
velocity vector {vpj,Vqj). The binary variable jj[k] equals 1 
if and only if attacker j is in polygon Q at step k. The binary 
variable Sij [k] equals 1 if and only if attacker j is in defender 
i's intercept region [k] at step k. The binary variables g,nj [k], 
dmij[k], and bmi[k] follow a similar trend. For Defensive Drill 

2, we also need a;ij [A:] = 1 if and only if defender i is in 
attacker j's warning region at step k. And similarly for 
the binary variable Wmijik]- 

With the variables for the general case defined, inequality 
constraints are added to the formulation in a similar way 





Fig. 9. The solution to four instances of Defensive Drill 1 . For Figures (a) 
and (b), Na = 3 and No = 1- For Figures (c) and (d), iV^ = 4 and 
Nd = 2. In Figures (a) and (c), the defenders deny all attackers from the 
Defense Zone. In Figures (b) and (d), an attacker enters the Defense Zone. 



as those derived for the one-on-one case. For example, the 
constraints identifying jj [k] = 1 with {pj [k] , qj [k]) E Q are 

9mj [fc] - 7j [fc] > Vm e {1, . . . , Md,} 

^(l-gz,[fc])+7,[fc] > 1, 
1=1 

for all fc e {1, . . . , Na} and for all j e {I, . . . , Na}- 
And finally, the objective function is given by 

Na Na Nd JVu-1 

■^ = EE^^W+^E E iKM + \Uy^[k]\). (44) 

j = l k=l 4=1 fc=0 

Results for example instances of Defensive Drill 1 are 
shown in Figure |9] Results for Defensive Drill 2 are shown in 
Figure [TO] 

V. Average case complexity 

In this section, we explore the average case complexity of 
Defensive Drill 1 by solving randomly generated instances. 
Each instance is generated by randomly picking parameters 
from a uniform distribution over the intervals defined below. 
Each MILP is solved using AMPL [19] and CPLEX [22] on a 
PC with Intel PHI 550MHz processor, 1024KB cache, 3.8GB 
RAM, and Red Hat Linux. For all instances solved, processor 
speed was the limiting factor, not memory. 

To generate the initial condition [ps , qs ) and the constant 
velocity vector (up, Vq) for each attacker we pick r^, 9a, and 
Va randomly from a uniform distribution over the intervals 
[^min^^max]^ [0, 27r), and [w^"'", respectively. The pa- 

rameters are then given by 





Fig. 10. The solution to two instances of Defensive Drill 2. Figures (a) and 
(c) show the playing field. Figures (b) and (d) show each attacker's distance 
from the center of the Defense Zone versus time. 



To generate the initial condition {xs 
defender, we pick rd, 0d, Vd, and 9 
uniform distribution over the intervals 



ys,is,ys) for each 
, randomly from a 

^3"'"'^^' [0,27r), 



and [0, 27r), respectively. The defender's initial 



condition is then given by 



Xs = rd cos tid, Vs = Td sm tid 
Xs = Vd cos 9v, Vs ^ Vd sin 9v . 



(46) 



Ps = raCOSUa, q, 
Vp = VaPi 



VaQs 



(45) 



We take the playing field to be circular with radius Rf = 15, 
and we set the radius of the Defense Zone to be Rdz — 
2. The intervals used to generate the instances are rd G 
[V2Rdz,2V2Rdz], Vd e [0.5,1.0], and ra G [i?//2,i?/]. We 
take the attacker velocity to be Va = 1.0. 

In Figure we plot the fraction of instances solved versus 
computation time for the case where the cost function includes 
the number of attackers that enter the Defense Zone plus 
a penalty on the control effort. This cost function is given 
by equation (I44> with e = 0.1. In Figure El we plot the 
fraction of instances solved versus computation time for the 
case where the cost function includes only the number of 
attackers that enter the Defense Zone. This cost function is 
given by equation i44\ with e = 0. 

As these figures show, the case where the cost function 
only includes the number of attackers that enter the Defense 
Zone is less computationally intensive than the case where the 
cost function also includes a penalty on the control effort. For 
example, for the case where e = 0, Nd = 3, and Na = 4, 
50% of the problems are solved in 3.5 seconds. However, for 
the case where e = 0.1, Nd = 3, and Na = 4, 50% of the 
problems are solved in 78 seconds. 

When e = in the cost function, the solver stops once a 
set of trajectories for the defenders is found that results in the 
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Fig. 11. Fraction of instances solved versus computation time. The cost 
function is the number of att acke rs that enter the Defense Zone plus a penalty 
on control effort (equation 1441 with e > 0). We consider instances with 
defender set of size three (Nj} = 3) and attacker sets of size = 
1, 2, 3, 4, 5. To generate each curve, 200 random instances were solved. The 
parameters are M„ = 4, M/ = 4, M^^ = i, Nu = 15, and Na = 15. 



Fig. 12. Fraction of instances solved versus computation time. The cost 
function is the number of attackers that enter the Defense Zone (equation 1441 
with e = 0). We consider instances with defender set of size three (No = 3) 
and attacker sets of size Nj\ = 3,4,5,6,7. To generate each curve, 200 
random instances were solved. The parameters are Mu = 4, Mj = 4, Mdz = 
4, Nu = 15, and Na = 15. 



minimum number of attackers entering the Defense Zone. This 
trajectory is often not the most efficient trajectory with respect 
to the control effort of the defenders. For the case where e — 
0.1, once a set of defender trajectories is found that results in 
the minimum number of attackers entering the Defense Zone 
the solver continues to search. The solver searches until it 
finds a set of defender trajectories that not only results in the 
minimum number of attackers entering the Defense Zone but 
also uses the defender control effort in the most efficient way. 

In Figure \^ we plot the computation time necessary to 
solve 50% of the randomly generated instances versus the size 
of the attacker set. For all instances considered, the defender 
set is of constant size (iV^i ~ 3). We plot the data for both 
cost functions considered above (e = and e = 0.1). This 
figure shows that the computation time grows exponentially 
with the number of vehicles in the attacker set. 

VI. Discussion 

In this paper, we showed how to use MILPs to model and 
generate strategies for multi- vehicle control problems. The 
MILP formulation is very expressive and allows all aspects 
of the problem to be taken into account. This includes the 
dynamic equations governing the vehicles, the dynamic equa- 
tions governing the environment, the state machine governing 
adversary intelligence, logical constraints, and inequality con- 
straints. The solution to the resulting MILP is the optimal 
team strategy for the problem. As shown by our average case 
complexity study, the MILP method becomes computational 
burdensome for large problems and thus, for these cases, is 
not suitable for real-time applications. 

There are several steps that can be taken to make the 
MILP approach applicable for real-time multi-vehicle control 
applications. One approach is to solve the MILPs faster. 
This can be done by using a more powerful computer or 



by distributing the computation over a set of processors. For 
the multi-vehicle control problems, it may be advantageous to 
distribute the computation over the set of vehicles. Distributed 
methods for solving MILPs have been considered in [31]. 

Another approach is to move all, or some components 
of, the computational burden offline. This can be done by 
formulating the problem as a multi-parametric MILP. A multi- 
parametric MILP is a MILP formulated using a set of param- 
eters each allowed to take on values over an interval. This 
problem is much more computationally intensive than the 
MILP problems considered in this paper. However, because 
the computation is performed offline, this is not an issue 
unless the computation takes an unreasonable amount of time. 
With the solution to the multi-parametric MILP, a table can 
be formed and used to look up optimal team strategies for 
the multi-vehicle system in real time. Multi-parametric MILP 
control problems can be solved using the Multi-Parametric 
Toolbox [23]. 

Finally, we discuss efficient MILP formulations as a way 
of decreasing computational requirements. The computational 
effort required to solve a MILP depends most on the number 
of binary variables used in its MILP problem formulation. 
Thus, reducing the number of binary variables is advantageous. 
In this paper, our motivating problem required three different 
time discretizations. The discretizations included one for the 
control input to the vehicles, one for obstacle avoidance, and 
one for attacker intercept. With each discretization step, a set 
of binary variables must be added to the MILP formulation. 
In most of the instances solved in this paper, we discretized 
time uniformly. However, we posed the MILP in a general 
way such that nonuniform time discretizations could be used. 
This allows for intelligent time distribution selection, which 
can significantly reduce the required computational effort to 
solve a problem. In [17], we developed several iterative MILP 
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Fig. 13. Computation time necessary to solve 50% of the randomly generated 
instances versus the number of attackers. For all instances, the defender set is 
of size three (N£) = 3). Each square denotes a data point for the case where 
the cost function is the number of attackers that enter the Defense Zone plus 
a penalty on each defender's control effort (e = 0.1 case). Each asterisk 
denotes a data point for the case where the cost function is the number of 
attackers that enter the Defense Zone (e = case). The solid lines denote 
fitted exponential functions to these data. 

techniques for intelligent discretization step selection. 
Appendix I 

In this appendix, we illustrate how to convert a logic 
expression into an equivalent set of inequalities using the 
procedure from [40]. First the logic expression is converted 
into a conjunction of disjunctions, called conjunctive normal 
form (CNF), by applying tautologies. For example, let the 
variables pi G {0, 1} be binary variables. The expression 

((pi = 1) or ip2 = 1)) 

and ((p3 = 0) or {p2 = 1)) 

and ((p4 = 1) or (ps = 1) or {pe = 0)) (47) 

is in CNF 

Once we have converted the logic expression into CNF we 
can easily write each disjunction as an inequality constraint 
that must be satisfied in order for the disjunction to be true. For 
example, the second disjunction of equation ( 147 1 . {{ps ~ 0) or 
{p2 = 1)), is true if and only if {I — ps) + P2 > 1- Therefore, 
equation ^\ is true if and only if the following inequalities 
hold 

Pi + P2 > 1 
(I-P3) +P2 > 1 

P4+P5 + (l-P6) > 1. (48) 

A. Equation i l24t 

First consider the (^) direction of equation i24\ . Replacing 
implication with disjunction we have 

(7[fca] = 1) or 

{9m[ka]^l Vme {l,...,Md.}) (49) 



and distributing the OR we have 

(7[A:J =0or5„,[fc,] - 1) Vm G {1, . . . , M^J (50) 

which is equivalent to 

{j[ka] ^ or gi[ka] ^ 1) 
and {'y[ka] = or g2[ka] = 1) 

and {'-f[ka] = or gM^J^a] = 1) • (51) 

This expression is in CNF, therefore it is equivalent to the 
following set of inequality constraints 

(1 - 7[fca]) + 9m[ka] > 1 Vm G {1, . . . , Md,}. (52) 

Now consider the (-^) direction of equation ( I24t . Replacing 
implication with disjunction 

i-f[ka] = 1) or 

^i9m[ka] = 1 Vm G {1, . . . , Md,}) (53) 
and distributing the negation we have 

{l[ka] = 1) or {3m such that g,„\ka] = 0) . (54) 
which is equivalent to 

il[ka] = 1) 

or (5i[fca] =0) 
or (.g2[fca] =0) 

or (gAU^ [ka] = 0) . (55) 

This expression, which is a disjunction, is in CNF and there- 
fore is equivalent to the following inequality 

J2i^-9^[ka])+-f[ka]>l■ (56) 
i=l 

B. Equation IU31 

First consider the (=>) direction of equation (13 3> . Replacing 
implication with the equivalent disjunction we have 

{a[ka + 1] = 0) or 

{a[ka] = 1 and j[ka] = and S[ka] = 0) (57) 

and distributing OR over AND we have 

ia[k, + 1] = or S[k,] = 0) 

and {a[ka + 1] = or a[ka] = 1) 

and {a[ka + 1] = or j[ka] = 0) . (58) 

Now that the expression is in CNF we can easily write it as 
a set of inequality constraints. 

{1 - a[ka + 1]) + {I - 6[ka]) > 1 

(1 - a[ka + 1)) + a[ka] > 1 

(l-a[fc„ + l)) + (l-7[A:„]) > 1. (59) 
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Now consider the other direction (<;=) of equation j33> . 
Replacing the implication with disjunction we have 

{a[ka + 1] = 1) or 

^{S[ka]=0 and a[ka] = 1 and -f[ka] = 0) (60) 
and distributing the negation we have 

S[ka] = 1 or a[ka] = or j[ka] = 1 or a[ka + 1] = 1 (61) 
this disjunction is equivalent to the flowing inequality 

S[ka] + (1 - a[ka]) + j[ka] + a[ka + 1] > 1. (62) 

In summary, we have taken the governing equations for the 
attacker's binary state i32\ and derived an equivalent set of 
inequalities: ( I59l l and ( I62t which can be simplified to the 
inequalities given by equation ( I34> . 
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